Collapsing ROC approach for risk prediction research on both common and rare variants
Abstract
Risk prediction that capitalizes on emerging genetic findings holds great promise for improving public health and clinical care. However, recent risk prediction research has shown that predictive tests built on known common genetic loci, including those from genome-wide association studies, lack sufficient accuracy for clinical use. Because most rare variants in the genome have not yet been studied for their role in risk prediction, future disease prediction research should shift toward a more comprehensive strategy that takes into account both common and rare variants. We propose a collapsing receiver operating characteristic (CROC) approach for risk prediction research on both common and rare variants. The new approach extends a previously developed forward ROC (FROC) approach with additional procedures for handling rare variants. We evaluated the approach using 533 single-nucleotide polymorphisms (SNPs) in 37 candidate genes from the Genetic Analysis Workshop 17 mini-exome data set. A prediction model built on all SNPs was more accurate (AUC = 0.605) than one built on common variants alone (AUC = 0.585). We further compared the performance of the two approaches by gradually reducing the number of common variants in the analysis. The CROC method attained higher accuracy than the FROC method as the number of common variants in the data decreased. In the extreme scenario in which the data contained only rare variants, the CROC reached an AUC of 0.603, whereas the FROC had an AUC of 0.524.
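To make the two ingredients named in the abstract concrete, the following minimal sketch collapses rare variants into a single carrier indicator and scores that indicator by its AUC. It is an illustration under stated assumptions, not the authors' CROC procedure: the function names, the carrier-indicator collapsing rule, the minor-allele-frequency threshold, and the toy genotypes are all hypothetical and do not come from the paper.

```python
def minor_allele_freq(genotypes):
    """Minor-allele frequency for one SNP, given 0/1/2 minor-allele counts."""
    return sum(genotypes) / (2 * len(genotypes))


def collapse_rare(snps, maf_threshold):
    """Collapse all SNPs with MAF below the threshold into one indicator.

    Returns, per individual, 1 if they carry at least one rare minor allele
    and 0 otherwise. This simple indicator rule is an assumption made for
    illustration; the paper may collapse rare variants differently.
    """
    rare = [g for g in snps if minor_allele_freq(g) < maf_threshold]
    n_individuals = len(snps[0])
    return [1 if any(g[i] > 0 for g in rare) else 0 for i in range(n_individuals)]


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: 3 SNPs genotyped in 10 individuals (4 cases, 6 controls).
snps = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # MAF 0.05, treated as rare
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],  # MAF 0.10, treated as rare
    [1, 1, 0, 1, 2, 0, 1, 0, 1, 0],  # MAF 0.35, treated as common
]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# The threshold of 0.15 is chosen only so the tiny toy sample has rare SNPs.
collapsed = collapse_rare(snps, maf_threshold=0.15)
rare_only_auc = auc(collapsed, labels)
print(collapsed, round(rare_only_auc, 3))
```

In a fuller analysis the collapsed indicator would enter a model alongside the common variants, and the AUC would be estimated on held-out data rather than on the sample used to build the predictor.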
Related articles
Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data
Association studies using tag SNPs have been successful in detecting disease-associated common variants. However, common variants, with rare exceptions, explain at most 5-10% of the heritability resulting from genetic factors, which leads to the common disease/rare variants assumption. Indeed, recent studies using sequencing technologies have demonstrated that common diseases can be due to...
Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data.
Although whole-genome association studies using tagSNPs are a powerful approach for detecting common variants, they are underpowered for detecting associations with rare variants. Recent studies have demonstrated that common diseases can be due to functional variants with a wide spectrum of allele frequencies, ranging from rare to common. An effective way to identify rare variants is through di...
Challenges and directions: an analysis of Genetic Analysis Workshop 17 data by collapsing rare variants within family data
Recent studies suggest that the traditional case-control study design does not have sufficient power to discover rare risk variants. Two different methods, collapsing and family data, are suggested as alternatives for discovering these rare variants. Compared with common variants, rare variants have unique characteristics. In this paper, we assess the distribution of rare variants in family data....
Improved power by collapsing rare and common variants based on a data-adaptive forward selection strategy
Genome-wide association studies have been used successfully to detect associations between common genetic variants and complex diseases, but common single-nucleotide polymorphisms (SNPs) detected by these studies explain only 5-10% of disease heritability. Alternatively, the common disease/rare variants hypothesis suggests that complex diseases are often caused by multiple rare variants with mo...
Identification of functional rare variants in genome-wide association studies using stability selection based on random collapsing
Genome-wide association studies are a powerful approach used to identify common variants for complex disease. However, the traditional genome-wide association methods may not be optimal when they are applied to rare variants because of the rare variants' low frequencies and weak signals. To alleviate the difficulty, investigators have proposed many methods that collapse rare variants. In this p...